# *Contextual Bandits with Knapsacks for a Conversion Model* (Code for Numerical Study)
***

## Introduction
These instructions reproduce the numerical study results in **Appendix F** of the article 
**Contextual Bandits with Knapsacks for a Conversion Model**. 

In the code, we use the name **Logistic Bandits** to refer to **Box C adaptive policy (specific to the conversion model)** 
and **Linear Bandits** to refer to **Box D adaptive policy (for linear CBwK)**.

---

## Step 0: Prepare the Environment
1. Unzip *numerical_study.zip* and use the unzipped folder *numerical\_study* as the **root directory**

2. Set up the Python environment:
    - Python version: 3.9
    - Dependencies: listed in *requirements.txt*
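
As a quick sanity check of the environment described above, the following minimal sketch compares the running interpreter against the stated Python version and lists the entries of *requirements.txt*. The helper names `python_version_matches` and `read_requirements` are hypothetical and not part of the provided scripts.

```python
import sys
from pathlib import Path

REQUIRED = (3, 9)  # Python version stated in this README


def python_version_matches(version_info=sys.version_info, required=REQUIRED):
    """Return True if the major.minor version equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


def read_requirements(path):
    """Return the non-empty, non-comment lines of a requirements file."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


if __name__ == "__main__":
    if not python_version_matches():
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        print(f"Warning: running Python {found}, but this README specifies 3.9")
    requirements = Path("requirements.txt")
    if requirements.is_file():
        print("Packages listed in requirements.txt:", read_requirements(requirements))
```

The packages themselves are typically installed from the root directory with `python -m pip install -r requirements.txt`.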

---

## Step 1: Prepare the dataset
- Download the underlying dataset from 
[UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients)
and save it as *./data/default of credit card clients.xls*


- Open the script *1\_Prepare\_Data.py* and verify or modify the section **AREA OF INPUT PARAMETERS** if needed 
(see the code comments for an explanation of each parameter), as follows:

~~~
PATH = "." # Root directory; should be the directory that contains this "README.md" file
PATH_DATA = f"{PATH}/data" # Path for data
PATH_MODELS = f"{PATH}/models" # Path for models

retrain_model = False # Default False. If retrain_model is True, it will retrain the PD model. If False, it will load the pretrained model
random_seed = 1989 # Random seed, used when retrain_model is True. To reproduce the PD model, set as 1989
~~~

- Run the script *1\_Prepare\_Data.py*

Note: The prepared data is saved in *./data/dt\_env.parq*
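
The file locations used in this step can be checked with a small sketch like the one below. It only tests for the presence of the input and output files named above; the helper names `step1_paths` and `missing_files` are hypothetical and not part of the provided scripts.

```python
from pathlib import Path


def step1_paths(root="."):
    """Return the input and output paths of Step 1, relative to the root directory."""
    data = Path(root) / "data"
    return {
        "input": data / "default of credit card clients.xls",
        "output": data / "dt_env.parq",
    }


def missing_files(paths):
    """Return the keys of `paths` whose file does not exist."""
    return [name for name, path in paths.items() if not Path(path).is_file()]


if __name__ == "__main__":
    # Before running the script, "input" should not be reported as missing;
    # after running it, "output" should not be reported either.
    print("Missing:", missing_files(step1_paths(".")))
```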

---

## Step 2: Get the **Optimal Static Policy**
- Open the script *2\_Optimal\_Static\_Policy.py* and verify or modify the section **AREA OF INPUT PARAMETERS** if needed, as follows:

~~~
##### Environment Parameters
PATH = "." # Root directory; should be the directory that contains this "README.md" file
PATH_DATA = f"{PATH}/data" # Path for data
PATH_MODELS = f"{PATH}/models" # Path for models

##### Parameters for Bandits
size_norm = 50000 # T, use 50000 to reproduce the results
budget = 1600 # Budget constraint, to reproduce the results, run with 1600 and 2200
random_seed = 1990 # To reproduce the results, use 1990
~~~

- Run the script *2\_Optimal\_Static\_Policy.py* **twice**: once with
`budget = 1600` and once with `budget = 2200`

Note: The **Optimal Static Policy** and 
**OPT / Costs** are saved in the corresponding files, for example: 
*./models/budget\_1600/policy\_optim\_static.pkl* and *./models/budget\_1600/dict\_expect\_optim\_reward\_costs.pkl*
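
Because the parameters are set by editing the script rather than through command-line arguments, switching between budgets means changing an assignment in the source. The sketch below shows one way to do that programmatically. It assumes the parameter appears as a simple `name = literal` line, optionally followed by a `#` comment, as in the excerpt above; the function `set_parameter` is hypothetical and not part of the provided scripts.

```python
import re


def set_parameter(source, name, value):
    """Replace the value of the first line-initial `name = ...` assignment.

    Any trailing `# comment` on that line is preserved. Only simple literal
    values (numbers, booleans, strings without '#') are supported.
    """
    pattern = re.compile(
        rf"^({re.escape(name)}[ \t]*=[ \t]*)([^#\n]*?)([ \t]*(#.*)?)$",
        re.MULTILINE,
    )
    new_source, count = pattern.subn(
        lambda m: f"{m.group(1)}{value!r}{m.group(3)}", source, count=1
    )
    if count == 0:
        raise ValueError(f"no assignment to {name!r} found")
    return new_source


if __name__ == "__main__":
    example = "budget = 1600 # Budget constraint\nrandom_seed = 1990\n"
    print(set_parameter(example, "budget", 2200))
```

The modified source can then be written back to the script (or to a copy in the same directory, so that relative paths still resolve) before each run.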

---

## Step 3 (*Optional*): Hyperparameter Tuning for *Box C adaptive policy (specific to the conversion model)* and *Box D adaptive policy (for linear CBwK)*
This step is **optional**: the **Hyperparameters** used for the **Numerical Study** are already provided in 
*./models/budget\_1600/dict\_hyper.pkl* and *./models/budget\_2200/dict\_hyper.pkl* for the corresponding budget constraints. 

**WARNING:**
Rerunning this step may overwrite the original **Hyperparameters** files.
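
To guard against losing the provided files, they can be copied before rerunning this step. The following is a minimal sketch; the function name `backup_hyperparameters` and the `.bak` suffix are arbitrary choices made for illustration, not part of the provided scripts.

```python
import shutil
from pathlib import Path


def backup_hyperparameters(root=".", budgets=(1600, 2200), suffix=".bak"):
    """Copy each existing dict_hyper.pkl to dict_hyper.pkl<suffix>.

    Returns the list of backup paths that were written.
    """
    copies = []
    for budget in budgets:
        src = Path(root) / "models" / f"budget_{budget}" / "dict_hyper.pkl"
        if src.is_file():
            dst = src.with_name(src.name + suffix)
            shutil.copy2(src, dst)  # copy2 also preserves file timestamps
            copies.append(dst)
    return copies


if __name__ == "__main__":
    for path in backup_hyperparameters("."):
        print("Backed up to", path)
```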

### Step 3a: Hyperparameter Tuning, except **eta** for **Box D adaptive policy (for linear CBwK)**
- Open the script *3a\_Hyperparameters\_Except\_OCO.py* and verify/modify the section **AREA OF INPUT PARAMETERS** if needed

- Run the script *3a\_Hyperparameters\_Except\_OCO.py* **twice**: once with
`budget = 1600` and once with `budget = 2200`

Note: The **Selected Hyperparameters** are saved in the corresponding files: 
*./models/budget\_1600/dict\_hyper.pkl* and *./models/budget\_2200/dict\_hyper.pkl*

### Step 3b: Run **Box D adaptive policy (for linear CBwK)** with different **etas**
- Open the script *3b\_Linear\_Bandits\_Tuning.py* and verify or modify the section **AREA OF INPUT PARAMETERS** if needed

- Run the script *3b\_Linear\_Bandits\_Tuning.py* **six times**, once for each combination of
`budget` (1600 and 2200) and `random_seed` (1989, 1990 and 1991)

Note: The trained models are saved in the corresponding files, for example:
*./models/budget\_1600/tuning\_random\_seed\_1989/linear\_bandits\_eta01\_C01.pkl*
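
The six runs of this step are the Cartesian product of the two budgets and the three seeds. As an illustrative sketch (the function name `tuning_runs` is hypothetical), the combinations can be enumerated as follows:

```python
from itertools import product

BUDGETS = (1600, 2200)
TUNING_SEEDS = (1989, 1990, 1991)


def tuning_runs():
    """Return one parameter dict per (budget, random_seed) combination."""
    return [
        {"budget": budget, "random_seed": seed}
        for budget, seed in product(BUDGETS, TUNING_SEEDS)
    ]


if __name__ == "__main__":
    for run in tuning_runs():
        print(run)
```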

### Step 3c: Update the Hyperparameters with the selected eta for **Box D adaptive policy (for linear CBwK)**
- Open the script *3c\_Hyperparameters\_OCO\_Eta.py* and verify or modify the section **AREA OF INPUT PARAMETERS** if needed

- Run the script *3c\_Hyperparameters\_OCO\_Eta.py*

Note: The updated **Hyperparameters** are saved in the corresponding files:
*./models/budget\_1600/dict\_hyper.pkl* and *./models/budget\_2200/dict\_hyper.pkl*
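
To inspect the hyperparameters after this step, the pickle files can be loaded directly. The sketch below assumes that each file holds a plain Python object such as a dictionary, which the file name suggests but which is not confirmed here; if the object refers to classes from the project, unpickling must happen where those modules are importable. The function name `load_hyperparameters` is hypothetical. Only unpickle files from a trusted source.

```python
import pickle
from pathlib import Path


def load_hyperparameters(root=".", budget=1600):
    """Load ./models/budget_<budget>/dict_hyper.pkl relative to `root`."""
    path = Path(root) / "models" / f"budget_{budget}" / "dict_hyper.pkl"
    with path.open("rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    for budget in (1600, 2200):
        try:
            print(budget, load_hyperparameters(".", budget))
        except FileNotFoundError:
            print(budget, "dict_hyper.pkl not found")
```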

---

## Step 4: Simulate with Optimal Static Policy, *Box C adaptive policy (specific to the conversion model)*, and *Box D adaptive policy (for linear CBwK)*
- Open the script *4\_Logistic\_Linear\_Bandits\_Simulations.py* and verify/modify the section **AREA OF INPUT PARAMETERS** if needed

- Run the script *4\_Logistic\_Linear\_Bandits\_Simulations.py* **twenty times**: 
  - **Ten times** with `budget = 1600`, once for each `random_seed` from 1989 to 1998
  - **Ten times** with `budget = 2200`, once for each `random_seed` from 1989 to 1998

Note: The **Simulation Results** are saved in the corresponding files under *./models*, for example:

*./models/budget\_1600/random\_seed\_1989/policy\_optim\_static.pkl* <br/>
*./models/budget\_1600/random\_seed\_1989/logistic\_bandits\_C01.pkl* <br/>
*./models/budget\_1600/random\_seed\_1989/linear\_bandits\_C01.pkl*
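
Before moving on, it can be useful to check that all twenty runs produced their output. The sketch below builds the expected paths from the example above. It assumes that every budget and seed combination yields the same three file names as the `budget_1600/random_seed_1989` example, which may not hold if parts of the names (such as `C01`) depend on the hyperparameters. The helper names `expected_outputs` and `missing_outputs` are hypothetical.

```python
from itertools import product
from pathlib import Path

BUDGETS = (1600, 2200)
SEEDS = range(1989, 1999)  # 1989 to 1998 inclusive
# File names copied from the example for budget 1600 and seed 1989 (assumed
# to be identical for all other combinations).
FILES = (
    "policy_optim_static.pkl",
    "logistic_bandits_C01.pkl",
    "linear_bandits_C01.pkl",
)


def expected_outputs(root="."):
    """Return the expected result paths for all twenty simulation runs."""
    return [
        Path(root) / "models" / f"budget_{budget}" / f"random_seed_{seed}" / name
        for budget, seed in product(BUDGETS, SEEDS)
        for name in FILES
    ]


def missing_outputs(root="."):
    """Return the expected result paths that do not exist yet."""
    return [path for path in expected_outputs(root) if not path.is_file()]


if __name__ == "__main__":
    print(f"{len(missing_outputs('.'))} of {len(expected_outputs('.'))} files missing")
```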

---

## Step 5: Collect Simulation Results and Generate the Graphs in **Appendix F**
1. Create the folder *./pics*

2. Open the script *5\_Plot\_Performance\_Algorithms.py* and verify/modify the section **AREA OF INPUT PARAMETERS** if needed

3. Run the script *5\_Plot\_Performance\_Algorithms.py* and get the **graphs** in 
*./pics/plot\_performance\_with\_legend.pdf*
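
The output folder from item 1 can also be created from Python instead of by hand. This is a minimal sketch with a hypothetical helper name, `ensure_pics_dir`; it does nothing if the folder already exists.

```python
from pathlib import Path


def ensure_pics_dir(root="."):
    """Create <root>/pics if it is missing and return its path."""
    pics = Path(root) / "pics"
    pics.mkdir(parents=True, exist_ok=True)
    return pics
```

Calling `ensure_pics_dir(".")` from the root directory before running the plotting script has the same effect as creating *./pics* manually.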

---